Hierarchical, or nested, data structures are common in many areas of research. Until 
recently, however, an appropriate technique for analyzing these types of data has been 
lacking. Now that several user-friendly software programs and more readable texts and 
treatments on the topic have become available, researchers will benefit from a greater 
understanding of hierarchical modeling and its applications. This Digest introduces 
hierarchical data structure, describes how hierarchical models work, and presents three 
approaches to analyzing hierarchical data. 

WHAT IS A HIERARCHICAL DATA STRUCTURE? 



People exist within organizational structures such as families, schools, businesses, 
churches, towns, states, and countries. In education, students exist within a hierarchical 
social structure that can include family, peer group, classroom, grade level, school, 
school district, state, and country. Many other communities exhibit hierarchical data 
structures as well. 

Bryk and Raudenbush (1992) discuss two other types of data hierarchies that are less 
obvious: repeated-measures data and meta-analytic data. Data repeatedly gathered on 
an individual is hierarchical because all the observations are nested within individuals. 
While there are other adequate procedures for dealing with this sort of data, the 
assumptions relating to them are rigorous, whereas procedures relating to hierarchical 
modeling require fewer assumptions. In meta-analysis, the analysis of a large number of 
existing studies, the subjects, results, procedures, and experimenters are nested within 
each experiment. 

WHY IS A HIERARCHICAL DATA STRUCTURE AN ISSUE? 



Hierarchical, or nested, data present several problems for analysis. First, people or 
creatures that exist within hierarchies tend to be more similar to each other than people 
randomly sampled from the entire population. For example, students in a particular 
third-grade classroom are more similar to each other than to students randomly 
sampled from the school district as a whole or from the national population of 
third-graders because they are not randomly assigned to classrooms from the 
population, but rather, based on geographic factors. Thus, students within a particular 
classroom tend to come from a community or community segment that is more 
homogeneous in terms of morals and values, family background, socioeconomic status, 
race or ethnicity, religion, and even educational preparation than the population as a 
whole. Further, students within a particular classroom share the same teacher and 
physical environment and have similar experiences, which may lead to increased 
homogeneity over time. 

The problem of independence of observations. 

Because individuals drawn from the same classroom or school tend to share certain 
characteristics (environmental, background, experiential, demographic, or otherwise), 
observations based on these individuals are not fully independent. However, most 
analytic techniques require independence of observations as a primary assumption for 
the analysis. Because this assumption is violated in the presence of hierarchical data, 
ordinary least squares (OLS) regression produces standard errors that are too small 
(unless the design effects of this clustering are incorporated into the analysis). In turn, 
this leads to a higher probability of rejecting a null hypothesis than if: (a) an appropriate 
statistical analysis were performed, or (b) the data included truly independent 
observations. 
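
The understated standard errors can be illustrated with a small simulation. The sketch below is not part of the original analysis: it assumes made-up values (30 groups of 20, an intraclass correlation of 0.2, total variance fixed at 1) and uses the standard design-effect approximation, 1 + (m - 1) * ICC for groups of size m, which the Digest itself does not state. It compares the standard error computed as if observations were independent with the actual spread of the sample mean across repeated clustered samples.

```python
import math
import random
import statistics


def clustered_sample(rng, n_groups, group_size, icc):
    """Draw one sample in which members of a group share a common effect.

    Total variance is fixed at 1, so `icc` is the share of variance due to groups.
    All parameter values are illustrative assumptions, not NELS estimates.
    """
    between_sd = math.sqrt(icc)        # spread of the shared group effect
    within_sd = math.sqrt(1.0 - icc)   # spread of the individual residual
    values = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0.0, between_sd)
        values.extend(group_effect + rng.gauss(0.0, within_sd)
                      for _ in range(group_size))
    return values


def compare_standard_errors(n_reps=300, n_groups=30, group_size=20,
                            icc=0.2, seed=1):
    """Return (empirical SE of the mean, average naive SE, design effect)."""
    rng = random.Random(seed)
    means, naive_ses = [], []
    for _ in range(n_reps):
        values = clustered_sample(rng, n_groups, group_size, icc)
        means.append(statistics.fmean(values))
        # Naive SE: treats all observations as independent.
        naive_ses.append(statistics.stdev(values) / math.sqrt(len(values)))
    empirical_se = statistics.stdev(means)   # actual spread of the mean
    mean_naive_se = statistics.fmean(naive_ses)
    design_effect = 1.0 + (group_size - 1) * icc
    return empirical_se, mean_naive_se, design_effect


empirical_se, mean_naive_se, design_effect = compare_standard_errors()
print(f"empirical SE of the mean: {empirical_se:.4f}")
print(f"average naive SE:         {mean_naive_se:.4f}")
print(f"squared ratio:            {(empirical_se / mean_naive_se) ** 2:.2f}")
print(f"design effect (formula):  {design_effect:.2f}")
```

Under these assumptions the squared ratio of the two standard errors should land near the design effect, which is one way to see why significance tests that ignore clustering reject too often.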



The problem of how to deal with cross-level data. 

Going back to the example of our third-grade classroom, it is often the case that a 
researcher is interested in understanding how environmental variables (e.g., teaching 
style, teacher behaviors, class size, class composition, district policies or funding, or 
even state or national variables) affect individual outcomes (e.g., achievement, 
attitudes, retention). But given that outcomes are gathered at the individual level, and 
other variables exist at the classroom, school, district, state, or nation level, the question 
arises as to what the unit of analysis should be, and how to deal with the cross-level 
nature of the data. 

One strategy would be to assign classroom or teacher (or school, district, or other) 
characteristics to all students (i.e., to bring the higher-level variables down to the 
student level). The problem with this approach, again, is non-independence of 
observations, because all students within a particular classroom assume identical 
scores on a variable. 

Another strategy would be to aggregate up to the level of the classroom, school, or 
district, thus enabling us to talk about the effect of teacher or classroom characteristics 
on average classroom achievement. However, this approach has two limitations: (a) up 
to 80 to 90 percent of the individual variability on the outcome variable is lost, which can 
lead to dramatic under- or over-estimation of observed relationships between variables 
(Bryk & Raudenbush, 1992), and (b) the outcome variable changes significantly and 
substantively from individual achievement to average classroom achievement. 
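
The loss of individual variability under aggregation can be seen in a toy example. The sketch below uses invented scores for three hypothetical classrooms (not data from the Digest) and splits the total sum of squares into a between-classroom part, which survives averaging, and a within-classroom part, which is discarded when only classroom means are analyzed.

```python
import statistics

# Hypothetical achievement scores for three classrooms (illustrative only).
scores_by_class = {
    "A": [48, 52, 55, 61, 64],
    "B": [50, 54, 58, 62, 66],
    "C": [53, 57, 60, 65, 70],
}


def variance_shares(groups):
    """Return (between share, within share) of the total sum of squares."""
    all_scores = [s for scores in groups.values() for s in scores]
    grand_mean = statistics.fmean(all_scores)
    total_ss = sum((s - grand_mean) ** 2 for s in all_scores)
    # Between-group part: what remains after each score is replaced by its class mean.
    between_ss = sum(
        len(scores) * (statistics.fmean(scores) - grand_mean) ** 2
        for scores in groups.values()
    )
    within_ss = total_ss - between_ss
    return between_ss / total_ss, within_ss / total_ss


between_share, within_share = variance_shares(scores_by_class)
print(f"between classrooms: {between_share:.1%}")
print(f"within classrooms:  {within_share:.1%}")
```

With these made-up scores, roughly 89 percent of the variability lies within classrooms and would be lost by aggregating to classroom means.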

Aside from these problems, both strategies prevent the researcher from disentangling 
individual and group effects on the outcome of interest. As neither one of these 
approaches is satisfactory, the third approach, that of hierarchical linear modeling 
(HLM), becomes necessary. 

HOW DO HIERARCHICAL MODELS WORK? 



The basic concept behind hierarchical modeling is similar to that of OLS regression. On 
the base level (usually the individual level, referred to here as level 1), an outcome 
variable is predicted as a function of a linear combination of one or more level 1 
variables, plus an intercept, as follows: 

Y_ij = b_0j + b_1j X_1 + ... + b_kj X_k + r_ij 

where b_0j represents the intercept of group j, b_1j represents the slope of variable X_1 
of group j, and r_ij represents the residual for individual i within group j. On subsequent 
levels, the level 1 slope(s) and intercept become dependent variables being predicted 
from level 2 variables: 

b_0j = g_00 + g_01 W_1 + ... + g_0k W_k + u_0j 
b_1j = g_10 + g_11 W_1 + ... + g_1k W_k + u_1j 

and so forth, where g_00 and g_10 are intercepts, g_01 and g_11 represent slopes 
predicting b_0j and b_1j respectively from variable W_1, and u_0j and u_1j are the 
level 2 residuals for group j. Through this process, we 
accurately model the effects of level 1 variables on the outcome, and the effects of level 
2 variables on the outcome. In addition, as we are predicting slopes as well as 
intercepts (means), we can model cross-level interactions, whereby we can attempt to 
understand what explains differences in the relationship between level 1 variables and 
the outcome. 
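
To see where the cross-level interaction comes from, the level 2 equations can be substituted into the level 1 equation. The derivation below is a sketch for the simplest case, assuming a single level 1 predictor X_1 and a single level 2 predictor W_1; it uses the same symbols as the equations above.

```latex
\begin{aligned}
Y_{ij} &= b_{0j} + b_{1j} X_1 + r_{ij} \\
       &= (g_{00} + g_{01} W_1 + u_{0j}) + (g_{10} + g_{11} W_1 + u_{1j}) X_1 + r_{ij} \\
       &= g_{00} + g_{01} W_1 + g_{10} X_1 + g_{11} W_1 X_1
          + u_{0j} + u_{1j} X_1 + r_{ij}
\end{aligned}
```

In this combined form, g_11 multiplies the product W_1 X_1, so it carries the cross-level interaction: it describes how the level 1 slope changes with the level 2 variable. The terms u_0j + u_1j X_1 + r_ij form a composite error that is shared within groups, which is why OLS assumptions do not hold for such data.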

AN EMPIRICAL COMPARISON OF THE THREE APPROACHES TO ANALYZING HIERARCHICAL DATA 

To illustrate the outcomes achieved by each of the 
three possible analytic strategies for dealing with hierarchical data, disaggregation 
(bringing level 2 data down to level 1), aggregation, and multilevel modeling, data were 
drawn from the National Education Longitudinal Study of 1988. This data set contains 
data on a representative sample of approximately 28,000 U.S. eighth graders at a 
variety of levels, including individual, family, teacher, and school. The analysis we 
performed predicted composite achievement test scores (math and reading combined) 
from student socioeconomic status (family SES), student locus of control (LOCUS), the 
percent of students in the school who are members of racial or ethnic minority groups 
(%MINORITY), and the percent of students in a school who receive free lunch 
(%LUNCH). Achievement is our outcome, SES and LOCUS are level 1 predictors, and 
%MINORITY and %LUNCH are level 2 indicators of school environment. In general, 
SES and LOCUS are expected to be positively related to achievement, and 
%MINORITY and %LUNCH are expected to be negatively related to achievement. In 
these analyses, 995 of a possible 1,004 schools were represented (the remaining nine 
were removed due to insufficient data). 



Disaggregated analysis. 

In order to perform the disaggregated analysis, the level 2 values were assigned to all 
individual students within a particular school (which is how the NELS data set comes). A 
standard multiple regression was performed via SPSS entering all predictor variables 
simultaneously. The resulting model was significant, with R = .56, R2 = .32, 
F(4, 22899) = 2648.54, p < .0001. The individual regression weights and significance tests 
are presented in the following table. 

{See Table at end of Digest} 

Note: B refers to an unstandardized regression coefficient, and is used for the HLM 
analysis to represent the unstandardized regression coefficients produced therein, even 
though these are commonly labeled as betas and gammas. SE refers to standard error. 
Bs with different subscripts were found to be significantly different from other Bs within 
the row at p< .05. 

All four variables were significant predictors of student achievement. As expected, SES 
and LOCUS were positively related to achievement, while %MINORITY and %LUNCH 
were negatively related. 



Aggregated analysis. 

In order to perform the aggregated analysis, all level 1 variables (achievement, LOCUS, 
SES) were aggregated up to the school level (level 2) by averaging. A standard multiple 
regression was performed via SPSS entering all predictor variables simultaneously. The 
resulting model was significant, with R = .87, R2 = .75, F(4, 999) = 746.41, p < .0001. As 
seen in Table 1, both average SES and average LOCUS were significantly positively 
related to achievement, and %MINORITY was negatively related. In this analysis, 
%LUNCH was not a significant predictor of average achievement. 

Multilevel analysis. 

The multilevel analysis was performed via HLM, with the respective level 1 and level 2 
variables specified at their appropriate levels. 
Note also that all level 1 predictors were centered at the group mean, and all level 2 
predictors were centered at the grand mean. The resulting model demonstrated 
goodness of fit (Chi-square for change in model fit = 4231.39, 5 df, p < .0001). This 
analysis reveals significant positive relationships between achievement and the level 1 
predictors (SES and LOCUS), and strong negative relationships between achievement 
and the level 2 predictors (%MINORITY and %LUNCH). Further, the analysis revealed 
significant interactions between SES and both level 2 predictors, indicating that the 
slope for SES gets weaker as %LUNCH and %MINORITY increase. Also, there 
was an interaction between LOCUS and %MINORITY, indicating that as %MINORITY 
increases, the slope for LOCUS weakens. There is no clearly equivalent analogue to R 
and R2 available in HLM. 
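
The two kinds of centering mentioned above can be sketched in a few lines. The example below is illustrative only: the SES values, school labels, and free-lunch percentages are invented, and the function names are hypothetical helpers rather than anything from the HLM program. Group-mean centering subtracts each school's own mean from its students' level 1 scores; grand-mean centering subtracts the overall mean from a level 2 variable.

```python
import statistics


def group_mean_center(values, groups):
    """Subtract each observation's own group mean (used here for level 1 predictors)."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    group_means = {g: statistics.fmean(vs) for g, vs in by_group.items()}
    return [value - group_means[group] for value, group in zip(values, groups)]


def grand_mean_center(values):
    """Subtract the overall mean (used here for level 2 predictors)."""
    grand_mean = statistics.fmean(values)
    return [value - grand_mean for value in values]


# Hypothetical student-level SES scores and the school each student attends.
ses = [-0.5, 0.1, 0.4, 0.8, 1.0, 1.2]
school = ["s1", "s1", "s1", "s2", "s2", "s2"]

# Hypothetical school-level percent free lunch, one value per school.
pct_lunch = [40.0, 10.0]

ses_centered = group_mean_center(ses, school)
lunch_centered = grand_mean_center(pct_lunch)

print([round(v, 2) for v in ses_centered])
print(lunch_centered)
```

After group-mean centering, each student's score expresses standing relative to schoolmates, so the level 1 intercept for a school can be read as that school's mean outcome.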

COMPARISON OF THE THREE ANALYTIC 
STRATEGIES AND CONCLUSIONS 



For the purposes of this discussion, we will assume that the third analysis represents 
the best estimate of what the "true" relationships are between the predictors and the 
outcome. Unstandardized regression coefficients (Bs in OLS, betas and gammas in 
HLM) were compared statistically via procedures outlined in Cohen and Cohen (1983). 
In examining what is probably the most common analytic strategy for dealing with data 
such as these, the disaggregated analysis provided the best estimates of the level 1 
effects in an OLS analysis. However, it significantly overestimated the effect of SES, 
and significantly and substantially underestimated the level 2 effects. The 
standard errors in this analysis are generally lower than they should be, particularly for 
the level 2 variables. 

In comparison, the aggregated analysis overestimated the multiple correlation by more 
than 100%, overestimated the regression slope for SES by 79% and for LOCUS by 
76%, and underestimated the slopes for %MINORITY by 32% and for %LUNCH by 
98%. 
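
The slope percentages above can be reproduced from the coefficients in the table at the end of the Digest. The sketch below assumes that each percentage compares the magnitude of the aggregated coefficient with the magnitude of the HLM coefficient; that reading is an inference from the reported figures, not a procedure stated in the text.

```python
# Unstandardized coefficients (B) taken from the table at the end of the Digest.
aggregated = {"SES": 7.28, "LOCUS": 4.97, "%MINORITY": -0.40, "%LUNCH": 0.03}
hlm = {"SES": 4.07, "LOCUS": 2.82, "%MINORITY": -0.59, "%LUNCH": -1.32}


def percent_difference(estimate, reference):
    """Percent by which |estimate| exceeds (+) or falls short of (-) |reference|.

    Comparing magnitudes is an assumption made for this illustration.
    """
    return 100.0 * (abs(estimate) - abs(reference)) / abs(reference)


differences = {
    name: percent_difference(aggregated[name], hlm[name]) for name in hlm
}

for name, diff in differences.items():
    direction = "overestimated" if diff > 0 else "underestimated"
    print(f"{name}: {direction} by {abs(diff):.0f}%")
```

Positive values indicate overestimation relative to the HLM estimate and negative values indicate underestimation.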

These analyses reveal the need for multilevel analysis of multilevel data. Neither OLS 
analysis accurately modeled the true relationships between the outcome and the 
predictors. Additionally, HLM analyses provide other benefits, such as easy modeling of 
cross-level interactions, which allow for more interesting questions to be asked of the 
data. With nested and hierarchical data common in the social and other sciences, and 
with recent developments making HLM software packages more user-friendly and 
accessible, it is important for researchers in all fields to become acquainted with these 
procedures. 

Table 1 

               Disaggregated            Aggregated               HLM 
Variable       B       SE     t         B       SE     t         B       SE     t 

SES            4.97a   .08    62.11     7.28b   .26    27.91     4.07c   .10    41.29 

LOCUS          2.96a   .08    37.71     4.97b   .49    10.22     2.82    .08    35.74 

%MINORITY     -0.45a   .03   -15.53    -0.40a   .06    -8.76    -0.59    .07    -8.73 

%LUNCH        -0.43a   .03   -13.50     0.03b   .05     0.59    -1.32c   .07   -19.17 